Voting-Based Multiagent Reinforcement Learning for Intelligent IoT
Abstract
The recent success of single-agent reinforcement learning (RL) in Internet of Things (IoT) systems motivates the study of multiagent RL (MARL), which is more challenging but more useful in large-scale IoT. In this article, we consider a voting-based MARL problem, in which the agents vote to make group decisions and the goal is to maximize the globally averaged returns. To this end, we formulate the MARL problem based on the linear programming form of the policy optimization problem and propose a primal-dual algorithm to obtain the optimal solution. We also propose a voting mechanism through which distributed learning achieves the same sublinear convergence rate as centralized learning. In other words, distributed decision making does not slow down the process of achieving global consensus on optimality. Finally, we verify our proposed algorithm with numerical simulations and conduct case studies in practical IoT systems.
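For orientation, the linear programming form of policy optimization mentioned in the abstract can be sketched with the standard textbook LP for an average-reward Markov decision process. This is an illustrative assumption, not necessarily the article's exact formulation: the symbols $\mu$ (stationary state-action distribution), $P$ (transition kernel), $r_i$ (reward of agent $i$), $N$ (number of agents), and $v$ (dual variable, interpretable as a differential value function) are introduced here for illustration only.

```latex
% Standard average-reward LP (assumed form), with the globally averaged reward
% \bar r(s,a) = \frac{1}{N} \sum_{i=1}^{N} r_i(s,a).
\begin{aligned}
\max_{\mu \ge 0} \quad & \sum_{s,a} \mu(s,a)\, \bar r(s,a) \\
\text{s.t.} \quad & \sum_{a} \mu(s',a) = \sum_{s,a} P(s' \mid s,a)\, \mu(s,a) \quad \forall s', \\
& \sum_{s,a} \mu(s,a) = 1 .
\end{aligned}

% Dualizing the flow constraints with multipliers v(s') gives the saddle-point
% problem that a primal-dual method would iterate on:
\min_{v} \; \max_{\mu \in \Delta} \;
\sum_{s,a} \mu(s,a) \Big[ \bar r(s,a) + \sum_{s'} P(s' \mid s,a)\, v(s') - v(s) \Big]
```

Under this assumed form, a policy is recovered from an optimal $\mu$ by normalizing over actions, $\pi(a \mid s) \propto \mu(s,a)$. A primal-dual scheme alternates updates of $\mu$ and $v$ rather than solving the LP directly.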
Similar resources
Transfer Learning for Multiagent Reinforcement Learning Systems
Reinforcement learning methods have been successfully applied to build autonomous agents that solve many sequential decision-making problems. However, agents need a long time to learn a suitable policy, especially when multiple autonomous agents are in the environment. This research aims to propose a Transfer Learning (TL) framework to accelerate learning by exploiting two knowledge sources: (i)...
Asymmetric Multiagent Reinforcement Learning
A novel model for asymmetric multiagent reinforcement learning is introduced in this paper. The model addresses the problem where the information states of the agents involved in the learning task are not equal; some agents (leaders) have information about how their opponents (followers) will select their actions, and based on this information the leaders encourage followers to select actions that lead to...
Potential-based difference rewards for multiagent reinforcement learning
Difference rewards and potential-based reward shaping can both significantly improve the joint policy learnt by multiple reinforcement learning agents acting simultaneously in the same environment. Difference rewards capture an agent’s contribution to the system’s performance. Potential-based reward shaping has been proven to not alter the Nash equilibria of the system but requires domain-speci...
A multiagent architecture for concurrent reinforcement learning
In this paper we propose a multiagent architecture for implementing concurrent reinforcement learning, an approach where several agents, sharing the same environment, perceptions and actions, work towards a single objective: learning a single value function. We present encouraging experimental results derived from the initial phase of our research on the combination of concurrent reinforcement ...
Scalable Bayesian Reinforcement Learning for Multiagent POMDPs
Bayesian methods for reinforcement learning (RL) allow model uncertainty to be considered explicitly and offer a principled way of dealing with the exploration/exploitation tradeoff. However, for multiagent systems there have been few such approaches, and none of them apply to problems with state uncertainty. In this paper, we fill this gap by proposing a Bayesian RL framework for multiagent pa...
Journal
Journal title: IEEE Internet of Things Journal
Year: 2021
ISSN: 2372-2541, 2327-4662
DOI: https://doi.org/10.1109/jiot.2020.3021017